ca_counties <- read_sf(here("data", "ca_counties", "CA_counties_TIGER2016.shp"))
Use View(ca_counties) to inspect what it contains. Let’s simplify it by keeping only two attributes, NAME (county name) and ALAND (land area), and renaming them to county_name and land_area.
ca_subset <- ca_counties %>%
  select(NAME, ALAND) %>%
  rename(county_name = NAME, land_area = ALAND)
Take a look at ca_subset. Notice something very important about a simple features (sf) object: it assumes you want to keep the spatial information, so you can work with the rest of the data as if it were a non-spatial data frame while the spatial information just “sticks” (hence the term “sticky geometry”). Even though we only named NAME and ALAND in select(), the geometry column still exists!
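To see sticky geometry in isolation, here is a minimal sketch on toy data. The `toy` object and its columns are made up for illustration and are not part of the county shapefile; it assumes the {sf} and {dplyr} packages are installed.

```r
library(sf)
library(dplyr)

# Toy data (hypothetical): three points with two attribute columns
toy <- st_sf(
  name = c("a", "b", "c"),
  value = c(1, 2, 3),
  geometry = st_sfc(
    st_point(c(0, 0)), st_point(c(1, 1)), st_point(c(2, 2)),
    crs = 4326
  )
)

# select() names only `name`, but the geometry column comes along anyway
toy_names <- toy %>% select(name)
names(toy_names)

# To get a plain, non-spatial data frame, drop the geometry explicitly
toy_plain <- toy %>%
  st_drop_geometry() %>%
  select(name)
class(toy_plain)
```

Dropping the geometry on purpose with st_drop_geometry() is the usual way out when a summary table, rather than a map layer, is the goal.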
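Use st_crs() to check the existing CRS for spatial data. We see that this CRS is WGS 84 / Pseudo-Mercator (EPSG:3857), a projected CRS with units of metres. It is built on the geographic WGS 84 CRS (EPSG:4326) but is not the same thing.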
ca_subset %>% st_crs()
## Coordinate Reference System:
## User input: WGS 84 / Pseudo-Mercator
## wkt:
## PROJCRS["WGS 84 / Pseudo-Mercator",
## BASEGEOGCRS["WGS 84",
## DATUM["World Geodetic System 1984",
## ELLIPSOID["WGS 84",6378137,298.257223563,
## LENGTHUNIT["metre",1]]],
## PRIMEM["Greenwich",0,
## ANGLEUNIT["degree",0.0174532925199433]],
## ID["EPSG",4326]],
## CONVERSION["Popular Visualisation Pseudo-Mercator",
## METHOD["Popular Visualisation Pseudo Mercator",
## ID["EPSG",1024]],
## PARAMETER["Latitude of natural origin",0,
## ANGLEUNIT["degree",0.0174532925199433],
## ID["EPSG",8801]],
## PARAMETER["Longitude of natural origin",0,
## ANGLEUNIT["degree",0.0174532925199433],
## ID["EPSG",8802]],
## PARAMETER["False easting",0,
## LENGTHUNIT["metre",1],
## ID["EPSG",8806]],
## PARAMETER["False northing",0,
## LENGTHUNIT["metre",1],
## ID["EPSG",8807]]],
## CS[Cartesian,2],
## AXIS["easting (X)",east,
## ORDER[1],
## LENGTHUNIT["metre",1]],
## AXIS["northing (Y)",north,
## ORDER[2],
## LENGTHUNIT["metre",1]],
## USAGE[
## SCOPE["unknown"],
## AREA["World - 85°S to 85°N"],
## BBOX[-85.06,-180,85.06,180]],
## ID["EPSG",3857]]
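The long WKT printout can be hard to scan. As a rough sketch, the same check can be done programmatically by pulling out the EPSG code and asking whether the coordinates are longitude/latitude. The two single-point objects below are made up for illustration.

```r
library(sf)

# Two hypothetical one-point layers: one in geographic WGS 84 (EPSG:4326),
# one in the projected WGS 84 / Pseudo-Mercator CRS (EPSG:3857)
geographic <- st_sfc(st_point(c(-120, 37)), crs = 4326)
projected <- st_sfc(st_point(c(-13358339, 4439107)), crs = 3857)

# The EPSG code is available as an element of the crs object
st_crs(geographic)$epsg
st_crs(projected)$epsg

# st_is_longlat() is TRUE for degree-based CRSs and FALSE for projected ones
st_is_longlat(geographic)
st_is_longlat(projected)
```

Running the same two calls on ca_subset would be one way to confirm the EPSG code without reading the full printout.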
Plot the California counties using geom_sf(). Notice that we can update aesthetics just like we would for a regular ggplot object. Here, we map the fill color to land area (and change the color gradient).
ggplot(data = ca_subset) +
  geom_sf(aes(fill = land_area), color = "white", size = .1) +
  theme_void() +
  scale_fill_gradientn(colors = c("cyan", "blue", "purple"))
sesbania <- read_sf(here("data","red_sesbania","ds80.shp"))
# Check the CRS:
sesbania %>% st_crs()
## Coordinate Reference System:
## User input: Custom
## wkt:
## PROJCRS["Custom",
## BASEGEOGCRS["NAD83",
## DATUM["North American Datum 1983",
## ELLIPSOID["GRS 1980",6378137,298.257222101,
## LENGTHUNIT["metre",1]],
## ID["EPSG",6269]],
## PRIMEM["Greenwich",0,
## ANGLEUNIT["Degree",0.0174532925199433]]],
## CONVERSION["unnamed",
## METHOD["Albers Equal Area",
## ID["EPSG",9822]],
## PARAMETER["Latitude of false origin",0,
## ANGLEUNIT["Degree",0.0174532925199433],
## ID["EPSG",8821]],
## PARAMETER["Longitude of false origin",-120,
## ANGLEUNIT["Degree",0.0174532925199433],
## ID["EPSG",8822]],
## PARAMETER["Latitude of 1st standard parallel",34,
## ANGLEUNIT["Degree",0.0174532925199433],
## ID["EPSG",8823]],
## PARAMETER["Latitude of 2nd standard parallel",40.5,
## ANGLEUNIT["Degree",0.0174532925199433],
## ID["EPSG",8824]],
## PARAMETER["Easting at false origin",0,
## LENGTHUNIT["metre",1],
## ID["EPSG",8826]],
## PARAMETER["Northing at false origin",-4000000,
## LENGTHUNIT["metre",1],
## ID["EPSG",8827]]],
## CS[Cartesian,2],
## AXIS["(E)",east,
## ORDER[1],
## LENGTHUNIT["metre",1,
## ID["EPSG",9001]]],
## AXIS["(N)",north,
## ORDER[2],
## LENGTHUNIT["metre",1,
## ID["EPSG",9001]]]]
Notice that this CRS is different from the California counties CRS, so we’ll want to update it to match. Use st_transform() to update the CRS:
sesbania <- st_transform(sesbania, 3857)
# Then check it:
sesbania %>% st_crs()
## Coordinate Reference System:
## User input: EPSG:3857
## wkt:
## PROJCRS["WGS 84 / Pseudo-Mercator",
## BASEGEOGCRS["WGS 84",
## DATUM["World Geodetic System 1984",
## ELLIPSOID["WGS 84",6378137,298.257223563,
## LENGTHUNIT["metre",1]]],
## PRIMEM["Greenwich",0,
## ANGLEUNIT["degree",0.0174532925199433]],
## ID["EPSG",4326]],
## CONVERSION["Popular Visualisation Pseudo-Mercator",
## METHOD["Popular Visualisation Pseudo Mercator",
## ID["EPSG",1024]],
## PARAMETER["Latitude of natural origin",0,
## ANGLEUNIT["degree",0.0174532925199433],
## ID["EPSG",8801]],
## PARAMETER["Longitude of natural origin",0,
## ANGLEUNIT["degree",0.0174532925199433],
## ID["EPSG",8802]],
## PARAMETER["False easting",0,
## LENGTHUNIT["metre",1],
## ID["EPSG",8806]],
## PARAMETER["False northing",0,
## LENGTHUNIT["metre",1],
## ID["EPSG",8807]]],
## CS[Cartesian,2],
## AXIS["easting (X)",east,
## ORDER[1],
## LENGTHUNIT["metre",1]],
## AXIS["northing (Y)",north,
## ORDER[2],
## LENGTHUNIT["metre",1]],
## USAGE[
## SCOPE["unknown"],
## AREA["World - 85°S to 85°N"],
## BBOX[-85.06,-180,85.06,180]],
## ID["EPSG",3857]]
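For a smaller illustration of what st_transform() does to the coordinates themselves, the sketch below reprojects a single made-up point and then compares CRSs with `==` instead of reading two printouts side by side.

```r
library(sf)

# One hypothetical point at longitude -120, latitude 0 in geographic WGS 84
pt_lonlat <- st_sfc(st_point(c(-120, 0)), crs = 4326)

# Reproject to EPSG:3857; coordinates change from degrees to metres
pt_mercator <- st_transform(pt_lonlat, 3857)
st_coordinates(pt_mercator)

# Programmatic check that a layer is in the CRS we expect
st_crs(pt_mercator) == st_crs(3857)
st_crs(pt_lonlat) == st_crs(pt_mercator)
```

The same comparison, st_crs(sesbania) == st_crs(ca_subset), is a quick guard to run before plotting or joining two layers.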
Cool, now they have the same CRS. Plot them together!
Note: this may take a minute.
ggplot() +
  geom_sf(data = ca_subset) +
  geom_sf(data = sesbania, size = 1, color = "red")
Let’s say we want to count the red sesbania observation locations in this dataset by county. To do that, we first need to know which county each point falls in. Don’t worry…st_join() has you covered for spatial joins!
ca_sesbania <- ca_subset %>%
  st_join(sesbania)
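And then we can find counts by county (note: these count records in the dataset, not individual plants). Because st_join() performs a left join by default, a county with no matching records still appears as one row with missing values, so count() reports n = 1 for it rather than 0: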
sesbania_counts <- ca_sesbania %>%
  count(county_name)
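The join-then-count pattern is easier to check on data small enough to verify by hand. The sketch below uses two made-up square "counties" and two made-up records (all names here are hypothetical). It shows how the default left join in st_join() affects row counts, and one possible way to count only real matches.

```r
library(sf)
library(dplyr)

# Helper (hypothetical): a unit square whose lower-left corner is (xmin, 0)
square <- function(xmin) {
  st_polygon(list(rbind(
    c(xmin, 0), c(xmin + 1, 0), c(xmin + 1, 1), c(xmin, 1), c(xmin, 0)
  )))
}

# Two toy "counties" side by side
counties <- st_sf(
  county_name = c("A", "B"),
  geometry = st_sfc(square(0), square(1), crs = 3857)
)

# Two toy records, both inside county A; county B has none
records <- st_sf(
  record_id = c(1, 2),
  geometry = st_sfc(st_point(c(0.2, 0.5)), st_point(c(0.6, 0.5)), crs = 3857)
)

# Default st_join() is a left join: county B is kept, with record_id = NA
joined <- st_join(counties, records)

# count() tallies rows, so county B gets n = 1 despite having no records
row_counts <- joined %>%
  st_drop_geometry() %>%
  count(county_name)

# Counting only non-missing record ids gives 0 for county B
record_counts <- joined %>%
  st_drop_geometry() %>%
  group_by(county_name) %>%
  summarize(n_records = sum(!is.na(record_id)))
```

The geometry is dropped here only to keep the toy output small. Leaving out st_drop_geometry() keeps one geometry per county, so the result can still be mapped. Passing left = FALSE to st_join() is another option when counties without any records should be dropped entirely.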
Then we can plot a choropleth using the number of records for red sesbania as the fill color (instead of what we used previously, land area):
ggplot(data = sesbania_counts) +
  geom_sf(aes(fill = n), color = "white", size = .1) +
  scale_fill_gradientn(colors = c("lightgray","orange","red")) +
  theme_minimal() +
  labs(fill = "Number of S. punicea records")
So we see that our usual wrangling skills still work with spatial data, because the spatial information just sticks to it! Let’s do a bit more for fun: keep only the county with the greatest number of red sesbania records (Solano), and make a map of those locations (there are many ways to do this):
# Subset of sesbania point locations only in Solano County
solano_sesbania <- sesbania %>%
  filter(COUNTY == "Solano")
# Only keep Solano polygon from California County data
solano <- ca_subset %>%
  filter(county_name == "Solano")
ggplot() +
  geom_sf(data = solano) +
  geom_sf(data = solano_sesbania)
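The filter above relies on an attribute column that happens to record the county. As one of the "many ways" mentioned, a spatial filter selects points by where they fall rather than by how they are labelled. This is a sketch on made-up data (the `poly` and `pts` objects are hypothetical), assuming {sf} and {dplyr} are installed.

```r
library(sf)
library(dplyr)

# One toy square polygon standing in for a county
poly <- st_sf(
  county_name = "A",
  geometry = st_sfc(
    st_polygon(list(rbind(c(0, 0), c(1, 0), c(1, 1), c(0, 1), c(0, 0)))),
    crs = 3857
  )
)

# Three toy points: two inside the square, one well outside it
pts <- st_sf(
  id = 1:3,
  geometry = st_sfc(
    st_point(c(0.2, 0.2)), st_point(c(0.8, 0.5)), st_point(c(5, 5)),
    crs = 3857
  )
)

# st_filter() keeps the rows of `pts` that intersect `poly`
pts_inside <- st_filter(pts, poly)
pts_inside$id
```

With the real data this might look like st_filter(sesbania, solano), which would not depend on the COUNTY attribute being filled in correctly. Both layers need to share a CRS first.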
Sometimes we’ll want to make a map interactive so that audience members can zoom in, explore different areas, etc. We can use the {tmap} package to create an interactive map. Let’s make one for our California counties (fill aesthetic by land area) with the red sesbania locations on top:
# Set the viewing mode to "interactive":
tmap_mode(mode = "view")
## tmap mode set to interactive viewing
# Then make a map (with the polygon fill color updated by variable 'land_area', updating the color palette to "BuGn"), then add another shape layer for the sesbania records (added as dots):
tm_shape(ca_subset) +
  tm_fill("land_area", palette = "BuGn") +
  tm_shape(sesbania) +
  tm_dots()